Analysis of the Effect of Unexpected Outliers in the Classification of Spectroscopy Data

Authors

  • Frank G. Glavin
  • Michael G. Madden
Abstract

Multi-class classification algorithms are very widely used, but we argue that they are not always ideal from a theoretical perspective, because they assume that all classes are characterised by the data, whereas in many applications training data for some classes may be entirely absent, rare, or statistically unrepresentative. We evaluate one-sided classifiers as an alternative, since they assume that only one class (the target) is well characterised. We consider the task of identifying whether a substance contains a chlorinated solvent, based on its chemical spectrum. For this application, it is not feasible to collect a statistically representative set of outliers, since that group may contain anything apart from the target chlorinated solvents. Using a new one-sided classification toolkit, we compare a One-Sided k-NN algorithm with two well-known binary classification algorithms, and conclude that the one-sided classifier is more robust to unexpected outliers.
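To make the idea of a one-sided nearest-neighbour classifier concrete, the following is a minimal sketch, not the authors' toolkit or their exact algorithm. It assumes a common one-class nearest-neighbour heuristic: a query is accepted as a target if its distance to the nearest target example is small relative to the local spacing of the target examples. The function name `one_sided_knn_predict`, the parameters `k` and `threshold`, and the toy two-dimensional "spectra" are all hypothetical and chosen only for illustration.

```python
import math


def one_sided_knn_predict(targets, query, k=1, threshold=1.0):
    """Return True if `query` is accepted as a member of the target class.

    Assumed rule (a common one-class nearest-neighbour heuristic, not
    necessarily the rule used in the paper):
      1. find the target example nearest to the query (distance d_query);
      2. average the distances from that example to its own k nearest
         target neighbours (d_local);
      3. accept the query when d_query / d_local <= threshold.
    Only target examples are needed for training; no outliers are used.
    """
    if len(targets) < k + 1:
        raise ValueError("need at least k + 1 target examples")

    # Step 1: nearest target example to the query.
    nearest_idx = min(
        range(len(targets)), key=lambda i: math.dist(targets[i], query)
    )
    d_query = math.dist(targets[nearest_idx], query)

    # Step 2: local scale around that example, excluding the example itself.
    neighbour_dists = sorted(
        math.dist(targets[nearest_idx], t)
        for i, t in enumerate(targets)
        if i != nearest_idx
    )
    d_local = sum(neighbour_dists[:k]) / k

    if d_local == 0.0:
        # Degenerate case (duplicated target points): accept exact matches only.
        return d_query == 0.0

    # Step 3: compare the distance ratio with the threshold.
    return d_query / d_local <= threshold


# Hypothetical toy data: four target examples on a unit square.
toy_targets = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(one_sided_knn_predict(toy_targets, (0.5, 0.5), k=2))  # inside the cluster
print(one_sided_knn_predict(toy_targets, (5.0, 5.0), k=2))  # far from all targets
```

Because the decision depends only on the target examples, a query that resembles nothing seen in training is rejected by default, which is the property the abstract contrasts with binary classifiers trained on an unrepresentative outlier set.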


Similar resources

Introduction Package CircOutlier For Detection of Outliers in Circular-Circular Regression

One of the most important problems in any statistical analysis is the existence of unexpected observations. Some observations are not a part of the study and are known as outliers. Studies have shown that outliers affect the performance of standard statistical methods in models and predictions. The point of this work is to provide a couple of statistical packages in R software to identi...


The Unexpected Effect of Sodium Arsenate on the Interaction between Histone H1 and Sodium N-Dodecyl Sulphate

A study was made of the interaction between histone H1 and sodium n-dodecyl sulphate (SDS) in the presence of sodium arsenate inside a phosphate buffer of pH 6.4, using spectroscopy and equilibrium dialysis at 27 °C. The binding data have been used to obtain the Gibbs free energy in terms of a theoretical model based on the Wyman binding potential. The binding data have been analysed...


Metabolomics-Based Study of Logarithmic and Stationary Phases of Promastigotes in Leishmania major by 1H NMR Spectroscopy

Background: Cutaneous leishmaniasis is one of the most important parasitic diseases in humans. One of the organisms responsible for this disease is Leishmania major, which is transmitted by a sandfly vector. There are specific differences in biochemical profiles and metabolite pathways between the logarithmic and stationary phases of Leishmania parasites. In the present study, 1H NMR spectroscopy was used...


Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to wrong modeling, biased estimation of parameters, and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...


Impact of Outliers in Data Envelopment Analysis

This paper examines the relationship between "Data Envelopment Analysis" and the statistical concept of an "Outlier". Data envelopment analysis (DEA) is a method for estimating the relative efficiency of decision making units (DMUs) having similar tasks in a production system, using multiple inputs to produce multiple outputs. An important issue in statistics is to identify the outliers. In this pap...


Investigation of outliers of evaluation scores among school of health instructors using outlier-determination indices

Introduction: Teacher evaluation, as an important strategy for improving the quality of education, has been considered by universities and leads to a better understanding of the strengths and weaknesses of education. Analysis of instructors’ scores is one of the main fields of educational research. Since outliers affect analysis and interpretation of information processes both structurally and concep...




Publication date: 2009